Weblogs as Market Indicators: Tracking Reactions to Issues and Events

Authors

  • Richard Tong
  • Mark Snuffin
Abstract

We have an ongoing interest in the large-scale analysis of consumer-generated media, of which weblogs are currently of special concern. In this position paper, we describe how our data gathering and analysis system, called T2TM, is being used to collect weblog entries and analyze them for reactions to issues and events in the commercial and government arenas. The paper includes preliminary lessons learned in trying to apply our sentiment analysis techniques to the highly informal language usage found in weblogs.

Operational Motivation

The blogging phenomenon is now well documented, with Technorati, for example, reporting that “the blogosphere continues to double every 5.5 months,” and that they are “tracking about 900,000 blog posts created every day.” This outpouring of commentary on anything and everything of interest to individual consumers adds yet another dimension to the set of large-scale “conversations” that the Internet supports. The challenge is to see whether this online material tells us anything of value about the reactions that consumers have to issues and events. In particular, we want to know whether weblogs provide market indicators that can enhance and complement more traditional marketing research techniques.

As in our previous work (Tong and Yager, 2004), we are primarily concerned with large-scale behaviors rather than analysis of any specific posting, and with the general sentiment expressed rather than the raw mentions of topics. This kind of analysis is of potential value in both commercial and government sectors. It can be used to gauge reaction to new products, or assess the impact of an advertising campaign, as well as to track attitudes towards US foreign policy, or understand the sentiment “on the street.”

Copyright © 2006, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

Collecting and Processing Weblogs

From one perspective, weblogs for us are just another online source. We collect them either directly using RSS/Atom feeds, or by using one or more of the blog search engines. From another, though, the highly idiosyncratic nature of weblogs, both with respect to language usage and posting structure, makes the application of standard text analysis techniques problematic, and has led us to explore the use of simpler pattern matching techniques. The goal is to balance the competing demands of accuracy and processing speed, while maintaining our ability to say something meaningful about the sentiment being expressed as we aggregate across time and postings.

In the weblog experiments reported here, we have focused on movies, since they are a common theme in weblog postings and engender plenty of evaluative language. In addition, movie releases are well-defined events with public-domain information, such as casting, advertising budgets and box-office receipts, all of which can be used to interpret and validate the signals we extract from the weblogs.

The basic method we use to analyze weblog postings first applies a series of segmentation rules to the original webpage. The objective is to identify content-bearing elements so that subsequent processing is applied only to those page elements that are the original contribution of the weblog author. After conversion to XML, the segmented page is analyzed using a sequence of patterns that look for affective language use in combination with various movie descriptors. Each pattern is differentially weighted, and the evidence from each is combined to give an overall sentiment score.

Our experiments to date have focused on repurposing families of lexico-syntactic patterns that we developed for other online forums, such as bulletin boards. The key technical challenge is to define an appropriate context for the application of the patterns, and, so far at least, we have been primarily exploring the use of variable-length windowing techniques.
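As an illustration only, the following Python sketch shows one way weighted patterns might be applied inside a context window around a movie descriptor and combined into positive and negative scores. It is not the T2TM implementation: the patterns, the weights, the tokenizer, the window sizes, and the names `PATTERNS`, `context_windows`, and `score_posting` are all assumptions made for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class WeightedPattern:
    regex: re.Pattern
    weight: float  # > 0 counts as positive evidence, < 0 as negative


# Hypothetical affective patterns and weights, invented for this sketch.
PATTERNS = [
    WeightedPattern(re.compile(r"\b(loved|brilliant|amazing)\b"), 1.0),
    WeightedPattern(re.compile(r"\bworth (seeing|watching)\b"), 0.5),
    WeightedPattern(re.compile(r"\b(hated|boring|awful)\b"), -1.0),
    WeightedPattern(re.compile(r"\bwaste of (time|money)\b"), -0.75),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def context_windows(text, descriptor, before=8, after=8):
    """Yield a token window around each mention of the descriptor.

    `before` and `after` set how many tokens on each side are kept,
    so the context length can be varied per call.
    """
    tokens = tokenize(text)
    target = tokenize(descriptor)
    n = len(target)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == target:
            start = max(0, i - before)
            yield " ".join(tokens[start:i + n + after])


def score_posting(text, descriptor, patterns=PATTERNS, before=8, after=8):
    """Return (positive, negative) evidence summed over all windows."""
    positive = negative = 0.0
    for window in context_windows(text, descriptor, before, after):
        for pattern in patterns:
            hits = sum(1 for _ in pattern.regex.finditer(window))
            if pattern.weight > 0:
                positive += hits * pattern.weight
            else:
                negative += hits * -pattern.weight
    return positive, negative


posting = (
    "Saw Hustle and Flow last night and loved it. Totally unrelated: "
    "my commute was awful and boring, the train was late again."
)
narrow = score_posting(posting, "Hustle and Flow")
wide = score_posting(posting, "Hustle and Flow", after=12)
```

With the default window, only the evaluative word near the movie title is counted; widening the window with `after=12` also picks up negative language about the commute, which is the kind of context problem that windowing has to manage.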
The figure below shows example sentiment timelines for the movie Hustle and Flow, which was released on July 22, 2005. The upper curve (colored black in the original) shows the positive sentiment. The lower curve (colored red in the original) shows the negative sentiment.
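Timelines of this kind can be produced by aggregating per-posting scores by day. The sketch below is a minimal example under stated assumptions: each posting is assumed to have already been reduced to a `(date, positive, negative)` tuple, the function name `daily_timelines` is hypothetical, and the sample values are invented for illustration and are not results from the paper.

```python
from collections import defaultdict
from datetime import date


def daily_timelines(scored_postings):
    """Sum positive and negative scores per day.

    `scored_postings` is an iterable of (day, positive, negative) tuples.
    Returns (days, positive_series, negative_series), sorted by day.
    """
    totals = defaultdict(lambda: [0.0, 0.0])
    for day, positive, negative in scored_postings:
        totals[day][0] += positive
        totals[day][1] += negative
    days = sorted(totals)
    positive_series = [totals[d][0] for d in days]
    negative_series = [totals[d][1] for d in days]
    return days, positive_series, negative_series


# Invented sample scores, for illustration only.
sample = [
    (date(2005, 7, 23), 1.0, 0.0),
    (date(2005, 7, 22), 0.5, 1.0),
    (date(2005, 7, 23), 0.5, 0.75),
]
days, positive_series, negative_series = daily_timelines(sample)
```

The two returned series correspond to the two curves of a sentiment timeline and can be passed to any plotting library.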

Publication date: 2006